Analysis of the Factors Affecting the Interval between Blood Donations Using Log-Normal Hazard Model with Gamma Correlated Frailty

Background: The time to the next blood donation plays a major role in whether a regular donor becomes a continuous one. The aim of this study was to determine the factors affecting the interval between blood donations. Methods: In a longitudinal study begun in 2008, 864 first-time donors at the Shahrekord Blood Transfusion Center (Shahrekord is the capital city of Chaharmahal and Bakhtiari Province, Iran) were selected by systematic sampling and followed up for five years. From this sample, the 424 donors who had at least two successful blood donations were chosen for this study, and the time intervals between their donations were taken as the response variable. Sex, body weight, age, marital status, education, place of residence (stay) and job were recorded as independent variables. Data were analyzed using a log-normal hazard model with gamma correlated frailty. In this model, each frailty is the sum of two independent components, each assumed to follow a gamma distribution. The analysis was carried out with a Bayesian approach using a Markov Chain Monte Carlo algorithm in OpenBUGS. Convergence was checked with the Gelman-Rubin criterion using the BOA program in R. Results: Age, job and education had a significant effect on the chance of donating blood (P<0.05). The chance of blood donation was higher for older donors, clerks, workers, free-job holders, students and educated donors and, in turn, the time intervals between their donations were shorter. Conclusions: Given the significant effect of some variables in the log-normal correlated frailty model, educational and cultural programs should be planned to encourage people with longer inter-donation intervals to donate more frequently.


Introduction
Blood and blood products play a major part in saving patients' lives. Because blood consumption is increasing for various reasons, the number of donors needs to increase as well [1]. Providing sufficient, healthy blood is the responsibility of blood transfusion centers. Unless enough healthy blood is provided, the health of society is at stake. Therefore, those in charge of health care are looking for ways to supply blood and prevent shortages in blood banks. To this end, identifying and attracting constant donors is of great importance [2]. Constant blood donors, according to the standards of the Blood Transfusion Organization, are those who donate blood at least twice a year. Blood transfusion centers therefore try to increase the number of constant donors in order to secure an ample blood supply for patients in need [3].
A recurrent event is one that a person can experience several times. The data obtained from the repetition of such events are called recurrent events data. Blood donation is treated as a recurrent event in survival analysis [4]. In recurrent events survival data, individual differences and the effect of previous events induce a correlation between the survival time intervals [5]. Because of this correlation between the recurrent survival times, hazard regression models that assume independent data cannot be applied, and a model capable of accounting for the correlation is needed [6][7][8]. One way of handling this problem is to use a frailty model. In this model, a common random effect called the frailty is assigned to the members of each group and multiplies the hazard function of each group member [8].
Since the log-normal hazard model with correlated frailty is very complex, a Bayesian approach was used to estimate its parameters. In this approach, prior knowledge about the parameters is combined with the likelihood to produce the posterior distribution, which represents the total information about the parameters after the data have been observed [9]. Because of the complexity of frailty models, the posterior distribution of the parameters cannot be calculated analytically [10]. Therefore, a Bayesian analysis requires the posterior distribution of the parameters to be estimated with a Markov Chain Monte Carlo (MCMC) method. Successive sampling from the full conditional distributions of the parameters produces a Markov chain; after convergence, these draws can be treated as dependent samples from the marginal posterior distributions of the parameters, and inference about the parameters of interest can be based on them. In recent years, MCMC methods have made it possible to fit complex and extensive models for any sample size and to obtain precise estimates [11].
The log-normal hazard model is one of the important and widely applicable models in survival analysis and belongs to the generalized gamma family of parametric survival models [12]. In the log-normal model, the hazard rate increases rapidly from zero to a peak and then decreases gradually, so that it is unimodal with a comparatively long right tail [13,14]. In a preliminary analysis of several parametric models without frailty, the log-normal model fitted our data better than the exponential, Weibull, log-logistic and generalized gamma models. Using the log-normal model for the interval between blood donations is reasonable because immediately after a successful donation the chance of another donation is zero, then the chance of donation increases with time, and a long time after the previous donation the chance declines.
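As an illustration of this hazard shape, the following minimal Python sketch evaluates the log-normal hazard h(t) = f(t)/S(t) on a grid of days. The location and scale values (MU, SIGMA) and the grid are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math

def lognormal_hazard(t, mu, sigma):
    """Hazard h(t) = f(t) / S(t) of a log-normal survival time (t > 0)."""
    z = (math.log(t) - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))
    survival = 0.5 * math.erfc(z / math.sqrt(2.0))  # S(t) = 1 - Phi(z)
    return pdf / survival

# Hypothetical parameters on the log-day scale (assumption, not from the paper).
MU, SIGMA = math.log(120.0), 0.8

days = [1, 30, 60, 90, 120, 180, 365, 730, 1460]
hazards = [lognormal_hazard(d, MU, SIGMA) for d in days]
peak_index = hazards.index(max(hazards))

for d, h in zip(days, hazards):
    print(f"day {d:5d}: hazard {h:.6f}")
```

With such values the hazard is essentially zero just after a donation, rises to a single peak, and then declines slowly, which is the pattern described for the return to donation.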
The aim of this study was to determine the factors affecting the interval between blood donations in a sample of first-time blood donors, based on the log-normal hazard model with correlated frailty.

Methods
A subset of the data on the time intervals between returns to donation was used [15]. This subset comprised the 424 of 864 sampled donors who had made at least two successful donations (which means that each of them has at least two recorded survival times). Consequently, at least one uncensored time interval existed for each donor. The data were obtained from a longitudinal study of a group of donors who presented to the Shahrekord Blood Transfusion Center (Shahrekord is the capital city of Chaharmahal and Bakhtiari Province, Iran) for the first time in 2008. Their donation behavior was followed for five years, until 2013, and the information recorded in the Negareh software at this transfusion center was used for the analysis. This information included each person's donation dates, donation status (successful or deferral), and demographic variables. The time interval between two donations was defined as the survival time. Thus, the time between the first and second donations was regarded as the first survival time. If no second donation had been recorded by the end of the study, the time between the first donation and the end of the study was treated as a censored time.
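The construction of survival times described above can be sketched as follows. This is a minimal illustration, not the procedure applied to the Negareh records: the function name `gap_times`, the example dates and the study end date are hypothetical, only successful donation dates are assumed to be passed in, and the time from the last recorded donation to the end of follow-up is assumed to be treated as right-censored.

```python
from datetime import date

def gap_times(donation_dates, study_end):
    """Return (days, observed) pairs for one donor.

    Intervals between consecutive donations are observed (True); the time from
    the last donation to the end of the study is censored (False).
    """
    dates = sorted(donation_dates)
    gaps = [((later - earlier).days, True)
            for earlier, later in zip(dates, dates[1:])]
    gaps.append(((study_end - dates[-1]).days, False))
    return gaps

# Hypothetical donor with three successful donations (illustrative dates only).
example = gap_times(
    [date(2008, 3, 1), date(2008, 9, 1), date(2009, 3, 1)],
    study_end=date(2013, 3, 1),
)
print(example)
```

A donor with two successful donations therefore contributes one observed interval and one censored interval, which matches the statement that each donor in the subset has at least two recorded survival times.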
The response variable was defined as the number of days between two successive donations for each person. To examine the factors influencing these time intervals, age, sex, body weight, marital status, place of residence (stay), education, and job were entered into the hazard model as independent variables.
A series of unknown factors can influence the time intervals between donations, some related to the donor and some specific to a particular donation. Therefore, a correlation is expected between each person's survival times, that is, between the time intervals separating that person's donations. Accordingly, a random variable called the correlated frailty is assigned to each person to capture this correlation, and the hazard function is multiplied by it [16]. The correlated frailty for the j-th interval of donor i is the sum of a shared frailty (W_i) and an individual frailty (Z_ij), which are independent of each other; their values indicate to what extent each survival time has been influenced by common unknown factors or by specific ones [16]. In this study, the frailty components were assumed to follow gamma distributions with a common second parameter θ (entering as a rate, so that a gamma distribution with parameters (a, θ) has mean a/θ); since the components are independent, the correlated frailties also follow a gamma distribution. Specifically, the shared frailty has a gamma distribution with parameters (φ, θ) and the individual frailty has a gamma distribution with parameters (θ − φ, θ), provided that 0 < φ < θ. The correlated frailty then has a gamma distribution with parameters (θ, θ), with mean one and variance 1/θ. Since the survival times for each person are correlated, the correlated frailty model was used. Conditional on the frailty in the model, the survival times for each person were assumed to be independent of each other [16]. A full presentation of the correlated frailty model with a log-normal baseline hazard is given in the appendix.
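The distributional statements above can be checked by simulation. The sketch below is an illustration only: the symbol v for the correlated frailty, the function names and the parameter values are assumptions (θ and φ are back-calculated from the summary values reported in the Results, a frailty variance of 0.41 and a shared-frailty mean of 0.66, not taken from the posterior output). It assumes θ enters as a rate, so Python's `random.gammavariate(shape, scale)` is called with scale 1/θ.

```python
import random

def simulate_correlated_frailties(phi, theta, n_donors, n_intervals, seed=0):
    """Draw v_ij = W_i + Z_ij with W_i ~ Gamma(phi, rate theta) and
    Z_ij ~ Gamma(theta - phi, rate theta), all mutually independent."""
    if not 0.0 < phi < theta:
        raise ValueError("requires 0 < phi < theta")
    rng = random.Random(seed)
    scale = 1.0 / theta  # gammavariate takes (shape, scale); scale = 1 / rate
    frailties = []
    for _ in range(n_donors):
        w = rng.gammavariate(phi, scale)  # shared by all intervals of one donor
        frailties.append([w + rng.gammavariate(theta - phi, scale)
                          for _ in range(n_intervals)])
    return frailties

def mean(values):
    return sum(values) / len(values)

def correlation(x, y):
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    var_x = mean([(a - mx) ** 2 for a in x])
    var_y = mean([(b - my) ** 2 for b in y])
    return cov / (var_x * var_y) ** 0.5

THETA = 1.0 / 0.41   # assumption: frailty variance 1/theta = 0.41
PHI = 0.66 * THETA   # assumption: shared-frailty mean phi/theta = 0.66

sample = simulate_correlated_frailties(PHI, THETA, n_donors=20000, n_intervals=2)
first = [row[0] for row in sample]
second = [row[1] for row in sample]
pooled = first + second
frailty_mean = mean(pooled)
frailty_var = mean([(v - frailty_mean) ** 2 for v in pooled])
within_donor_corr = correlation(first, second)
print(frailty_mean, frailty_var, within_donor_corr)
```

Under these assumptions the simulated frailties should have a mean close to 1, a variance close to 1/θ, and a within-donor correlation close to φ/θ, because two frailties of the same donor share only the component W_i.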
The log-normal hazard model was used to model the baseline hazard function. To analyze the return-to-donation data, three models were used: the log-normal hazard model without frailty, the log-normal hazard model with gamma shared frailty, and the log-normal hazard model with gamma correlated frailty. The parameters were estimated by Bayesian analysis using the Markov Chain Monte Carlo (MCMC) method. Given the complexity of the posterior distribution calculations, the MCMC method is used to estimate the parameters for any sample size [11].
For the prior distributions of the model parameters, the log-normal hazard model was first fitted with non-informative priors on the regression coefficients, and the resulting estimates were then used to set informative priors in the shared frailty and correlated frailty models. Accordingly, the following normal priors were used for the regression parameters: mean 1 and variance 100 for the coefficient of age, mean 1 and variance 4 for sex, mean 0 and variance 100 for body weight, mean 1 and variance 4 for education, mean 1 and variance 4 for job, mean 0.1 and variance 4 for marital status, mean 1 and variance 2 for place of residence (stay), and mean 1 and variance 4 for the constant β_0. The prior for the frailty parameter φ was a uniform distribution on (0, θ), and the prior for the frailty parameter θ was a gamma distribution (α = 0.5, β = 10). The parameters were estimated using OpenBUGS software version 3.2.3 [17]. To verify the convergence of the Monte Carlo simulations, the Gelman-Rubin convergence criterion was computed with the BOA (Bayesian Output Analysis) program in R software version 3.2.0 [18]. The log-normal hazard model without frailty, the log-normal hazard model with gamma shared frailty, and the log-normal hazard model with gamma correlated frailty were compared using the deviance information criterion (DIC) [19].
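To make the convergence check concrete, the following sketch computes the basic Gelman-Rubin potential scale reduction factor for one parameter from several chains. It is a simplified textbook form written for illustration, not the BOA implementation, whose output may include further corrections; the function name and the simulated chains are hypothetical.

```python
import random
import statistics

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between_over_n = statistics.variance(chain_means)                    # B / n
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5

rng = random.Random(42)
# Three hypothetical chains sampling the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
# Same chains, but one is shifted to mimic a chain stuck in another region.
stuck = [chain[:] for chain in mixed]
stuck[2] = [x + 3.0 for x in stuck[2]]

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
print(r_mixed, r_stuck)
```

Values close to one indicate that the between-chain variability is negligible relative to the within-chain variability, whereas a chain exploring a different region inflates the factor well above one.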

Results
Of the 864 sampled donors, the 424 who had at least two successful donations were enrolled in this study. Of these, 404 (95.3%) were male. The donors' age at the first donation ranged from 21 to 69 yr, with a mean of 36.1 ± 10.2 yr. Their body weight at the first donation ranged from 45 to 120 kg, with a mean of 80.2 ± 11.6 kg. Only one donor had a body weight below 50 kg. Overall, 306 people (72.2%) were married at their first donation and the rest were single, and 344 (81.1%) lived in the city. The number of donations per donor varied from 2 to 13. The frequency distribution of the donors' characteristics is shown in Table 1. The time between recurrent donations was the response variable. The number of donations in each interval, the mean time interval between donations and the censoring rate for each time interval are given in Table 2. The results of fitting the log-normal hazard model with gamma correlated frailties, including the mean, median, standard deviation, and 95% credible intervals, based on 30,000 simulated values after discarding 5000 samples as a burn-in period, are shown in Table 3. Because the autocorrelation between successive simulated values was very high, only every 50th sample was monitored. The estimates of the Gelman-Rubin convergence criterion are shown in the last column of Table 3. These values are very close to one, indicating convergence for all model parameters. In our data, the DIC was 32814800 for the log-normal hazard model without frailty, 32810000 for the log-normal hazard model with gamma shared frailty, and 32210000 for the log-normal hazard model with gamma correlated frailty, which indicates that the model with gamma correlated frailty fitted best. Based on the estimates of the φ and θ parameters, the estimated variance of the frailty random effect is 0.41, the mean of the shared frailty is 0.66, the mean of the individual frailty is 0.34, and the correlation coefficient of the shared and individual frailty is 0.66.
The regression coefficients of body weight, marital status, sex, and place of residence (stay) were not significant, whereas age had a positive effect on the blood donation hazard function, which means that as age increases, the chance of blood donation increases and the time intervals between donations decrease (Table 3). Education had a significant impact on blood donation (P<0.05): university students and people with a diploma had a higher chance of donation than people with elementary education, i.e. they returned to donate sooner and their time intervals between donations were consequently shorter. Regarding job, all job categories, including clerks, workers, free-job holders, students and the unemployed, differed significantly from housewives, which means that donors in these categories donated more often than housewives and their intervals were shorter.

Discussion
Blood and its products are critical in saving lives, especially those of patients. Therefore, identifying and attracting potential continuous donors is of great importance [1]. The response variable in this study was the time interval between donations, measured in days, and it was found that as the number of donations increased, the average time interval between two donations decreased. Because of unknown and unmeasurable factors, such as genetic, environmental and physical factors and people's attitudes toward blood donation, the time intervals between blood donations of each donor are dependent. In this study, the log-normal baseline hazard model was used to model the survival times; however, because of the correlation between each person's survival times, the correlated frailty was added [4]. The correlated frailties in the log-normal hazard model capture the correlation between the donation time intervals of each person [6]. This correlation indicates that a series of unknown factors influences the intervals between donations for each person, so that a given person tends to donate blood either at long or at short time intervals.
In this dataset, the log-normal gamma correlated frailty model indicated that age had a positive impact on the hazard function of the time intervals between donations, which means that as age increases, the chance of donation increases and, as a result, the time intervals between two donations decrease. Therefore, by raising awareness and giving appropriate training, young donors can become continuous donors. Other studies also reported that the donation rate increased with age [20,21]. However, elsewhere age had minor significance and a negative effect on the chances of blood donation [22]; the same finding was reported in James and Matthew's study [23], whereas in another study [24] age had no impact on return behavior. Job had a statistically significant influence on the time intervals between donations: housewives' chances of donation were lower than those of people in other jobs. The same result, i.e. a significant effect of job, was found in another study [3]. The number of donors who were clerks or free-job holders was higher than for other jobs. Repeated visits from universities to blood transfusion centers can be organized in order to attract more university students to donate blood. University students and people with a diploma had a higher chance of donation than people with elementary education, i.e. they returned to donate sooner and their time intervals between donations were consequently shorter.
Education had a positive effect on return behavior in one study [3], whereas in others education had no effect on the time intervals between donations [21,25,26]. Since training plays a major role in the continuity of donation, planning instructive programs that fit different levels of education is a big step toward enhancing the blood donation culture. In our previous works on the total dataset [15,22], body weight had a significant effect on the time interval between donations, such that the interval was shorter for donors with higher body weight. However, in this subset of the data, body weight did not show any significant effect on the time interval between donations.
In the transfusion data framework, the correlated effect is the sum of a donor random effect (the shared frailty) and a donation-specific random effect (the individual frailty). In our data, the estimated mean of the donor random effect was almost twice that of the donation-specific random effect. This result shows that the effect of the donor's unknown factors is greater than that of the unknown factors specific to a donation. The correlation between the donation intervals of each donor means that each donor returns to donate at roughly equal intervals, which shows stability in his or her return-to-donation behavior.
The correlated frailty has been used in many applications [27,28]. Here we used it to account for unknown random effects in recurrent event survival data. The log-normal hazard model for survival analysis is complex, especially in the presence of censoring [13,29]. Its complexity increases when a random effect is added to the model. This makes convergence of the simulated samples difficult and leads to high autocorrelation between successive simulated values. When running the MCMC algorithm, one sample out of every 50 generated was monitored in order to reduce the autocorrelation between successive samples and to obtain convergence. Although computational speed has improved greatly, Bayesian analysis of such a model is still very time-consuming.
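The effect of discarding a burn-in period and monitoring every 50th sample can be illustrated with a toy chain. The sketch below uses a first-order autoregressive sequence as a stand-in for highly autocorrelated MCMC output; the autoregression coefficient RHO, the chain length and the helper name are hypothetical, and the chain is not output from the model fitted in this study.

```python
import random

def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

rng = random.Random(1)
RHO = 0.95  # hypothetical strength of dependence between successive draws

chain = [0.0]
for _ in range(200_000):
    chain.append(RHO * chain[-1] + rng.gauss(0.0, 1.0))

BURN_IN, THIN = 5000, 50
kept = chain[BURN_IN::THIN]  # drop the burn-in, then keep every 50th draw

raw_acf = lag1_autocorrelation(chain[BURN_IN:])
thinned_acf = lag1_autocorrelation(kept)
print(len(kept), raw_acf, thinned_acf)
```

Thinning does not add information, but it yields a stored sample whose successive values are much less dependent, at the cost of running the sampler many times longer than the number of retained draws.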
In this study, the components of the frailty were assumed to follow gamma distributions. Future research could consider the log-normal hazard model with correlated frailty having inverse Gaussian or positive stable distributions.

Conclusions
Given the significant effect of some variables in the log-normal correlated frailty model, educational and cultural programs should be planned to encourage people with longer inter-donation intervals to donate more frequently, so that they become constant donors.

Table 1: Some characteristics of donors in the study (n = 424)

Table 3: Posterior summaries for the parameters of the log-normal correlated frailty model